Skip to content

fix(fetch): use the correct FETCH_HEAD from within a worktree #1108

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 14, 2021
Merged

fix(fetch): use the correct FETCH_HEAD from within a worktree #1108

merged 1 commit into from
Jan 14, 2021

Conversation

muggenhor
Copy link
Contributor

@muggenhor muggenhor commented Jan 13, 2021

FETCH_HEAD is one of the symbolic references local to the current
worktree and as such should not be looked up in the common_dir. But
instead of just hard coding the "right thing" (git_dir) lets defer this
to the SymbolicReference class which already contains this knowledge in
its abspath property.

This was introduced by #654. I suspect an overzealous search & replace of git_dir -> common_dir was the cause here.

FYI: this bug can be reproduced by this Python script. I've not been able to convert that into a test case though:

import logging
import os
import os.path
from tempfile import TemporaryDirectory

import git


logging.basicConfig(
    level=logging.DEBUG,
)

with TemporaryDirectory() as tmpdir:
    # create repo to have something to fetch from
    with git.Repo.init(os.path.join(tmpdir, "source"), expand_vars=False) as src_repo:
        author = git.Actor('Bob Tester', '[email protected]')
        src_repo.index.commit(
            message="Initial commit",
            author_date="0 +0000",
            commit_date="0 +0000",
            author=author,
            committer=author,
        )

    with git.Repo.clone_from(src_repo.working_dir, os.path.join(tmpdir, "test")) as repo:
        fetchinfo, = repo.remotes.origin.fetch(["master"])

        subtree = "some-subdir"
        repo.git.worktree("add", subtree, fetchinfo.ref.commit)

        # pollute the top-level FETCH_HEAD so it'll definitely not match the fetch we're about to do below
        repo.remotes.origin.fetch([fetchinfo.ref.commit.hexsha for _ in range(5)])

        with git.Repo(os.path.join(repo.working_dir, subtree), expand_vars=False) as subrepo:
            # throws due to reading $(cat .git/common_dir)/.git/FETCH_HEAD instead of .git/FETCH_HEAD
            subrepo.remotes.origin.fetch(["master"])

FETCH_HEAD is one of the symbolic references local to the current
worktree and as such should _not_ be looked up in the 'common_dir'. But
instead of just hard coding the "right thing" (git_dir) lets defer this
to the SymbolicReference class which already contains this knowledge in
its 'abspath' property.
@Byron Byron added this to the v3.1.13 - Bugfixes milestone Jan 14, 2021
@Byron
Copy link
Member

Byron commented Jan 14, 2021

Thanks a lot for the fix, the thorough PR description and for trying to even put this into a test.

No matter what, this fix should definitely be merged as I have a feeling it did break a lot of things downstream and it's surprising there weren't any bug reports yet.

@Byron Byron merged commit 037d62a into gitpython-developers:master Jan 14, 2021
@muggenhor muggenhor deleted the fix/fetch-parse-fetch-head-from-worktree branch January 14, 2021 10:30
@muggenhor
Copy link
Contributor Author

muggenhor commented Jan 14, 2021

... and it's surprising there weren't any bug reports yet.

I think I've got some idea why there weren't any yet. My team's been encountering this problem for at least as early as about June 2019 (unfortunately our CI logs don't go further than that). But it was very rare, and we couldn't consistently reproduce it.

Then somewhere early in 2020 the frequency of this happening jumped up to close to 100% but our initial attempts to diagnose it where unsuccessful and we had a simple workaround (disabling the use of multiple nodes in our Jenkins setup to prevent the need to fetch into worktrees from one node to another). My colleague who did that investigation felt, at the time, uncomfortable filing a bug report without having a good reproduction scenario. I later discovered that the cause of this frequency jump was a commit to our meta-CI system tomtom-international/hopic@14a4338 (that change causes the introduction of a conditional, but almost always occurring, extra repo.remotes.origin.fetch([list-of-commit-hashes]) on newly allocated build slaves, just before fetching into any configured worktrees).

It's the combination of fetching first in the top-level and only then fetching in a worktree that may cause this bug to cause an exception to be raised. In plenty of other cases you may just silently get wrong results being returned from remote.fetch(...). That makes it difficult to detect at all let alone consistently reproduce.

Then the error we got itself (and the above reproduction script will give with the old version) is also misleading: it's a TypeError suggesting that GitPython is somehow unable to parse a stderr or FETCH_HEAD line. So the direction of investigation my colleague took was trying to figure out what line that was and how to get it to that piece of code.

I still believe there's at least one bug, maybe two, in the processing of stderr & FETCH_HEAD there. But this was the one I managed to disentangle yesterday after pure curiosity + frustration motivated me sufficiently to do a deep dive for close to two full days on this one. I'll create two bug reports for those right away, even if the information I can provide at this time is very partial at this time.

I hope that bit of history helps to understand at least partially why this hasn't been reported before. The (feeling of) inability to describe what the problem is is a big obstacle there.

Anyway, thanks for your quick response :-)

@Byron
Copy link
Member

Byron commented Jan 14, 2021

Thanks so much for sharing, but… oh my…every paragraph I could feel the time spent and nerves lost and it hurt :/.
Having maintained GitPython forever now I do know for sure that it is not at all 'good' software, and many more bugs an issues are lurking to cause nasty surprises to people who definitely shouldn't have to deal with them.

Even though for GitPython I do believe the fight is lost, there is another endeavour which will definitely reach an entirely new level of quality. Maybe one day it will even arrive in Python.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants